Skip to content

[Refactor] New rule parser to support AST#104

Merged
colinthebomb1 merged 8 commits intomainfrom
colin/rule-parser-parse-v2
Apr 9, 2026
Merged

[Refactor] New rule parser to support AST#104
colinthebomb1 merged 8 commits intomainfrom
colin/rule-parser-parse-v2

Conversation

@colinthebomb1
Copy link
Copy Markdown
Collaborator

Summary

This PR introduces and stabilizes RuleParser v2 along with dedicated v2 test coverage.

Highlights

  • New RuleParserV2 that works with our AST Nodes.
  • Similar structure to original rule parser, but added step for substituing variables with corresponding nodes.
  • Aligns v2 variable semantics and node naming for clarity:
    • element variables (<x>)
    • set variables (<<y>>)
  • New comprehensive v2 tests in test_rule_parser_v2.py.

Testing

  • Ran tests/test_rule_parser_v2.py
  • Ran full test suite to confirm compatibility with existing behavior
  • All tests passing

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces RuleParserV2, a new rule parsing implementation that generates AST node structures instead of JSON, along with a comprehensive refactoring of variable node classes.

Changes:

  • New RuleParserV2 module that parses rules into AST structures with ElementVariableNode and SetVariableNode for rule variables
  • Renamed VarNode → ElementVariableNode and VarSetNode → SetVariableNode throughout the codebase for semantic clarity
  • Updated internal token naming from V/VL to EV/SV (ElementVariable/SetVariable) to align with new v2 semantics
  • Comprehensive test coverage (745 lines) for RuleParserV2 including parametrized tests over entire rule catalog
  • Code cleanup in test_rule_parser.py (import consolidation and formatting)

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Show a summary per file
File Description
core/rule_parser_v2.py New RuleParserV2 implementation with 6-step pipeline for rule parsing and AST generation
tests/test_rule_parser_v2.py Comprehensive test suite covering extendToFullSQL, replaceVars, parsing, variable substitution, and rule catalog validation
core/ast/node.py Renamed VarNode/VarSetNode to ElementVariableNode/SetVariableNode with updated docstrings
core/ast/init.py Updated exports to use new class names
tests/test_rule_parser.py Import consolidation and code cleanup (formatting and indentation fixes)
tests/test_ast.py Updated imports to use new ElementVariableNode/SetVariableNode names
tests/ast_util.py Updated imports and comments to reference new variable node class names
core/query_parser.py Updated imports and TODO comment to reference new class names

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Collaborator

@HazelYuAhiru HazelYuAhiru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall LGTM! Just with a small comment.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@colinthebomb1 colinthebomb1 requested a review from baiqiushi April 8, 2026 19:17
Copy link
Copy Markdown
Contributor

@baiqiushi baiqiushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Overall the PR is well-structured — clean renames, solid test coverage (774 lines), and all prior review threads resolved. One correctness issue found:

🐛 Variable namespace collision in replaceVars (core/rule_parser_v2.py:194-197)

replaceVars processes set variables (<<x>>) before element variables (<x>), but both passes share the same mapping dict. When a rule uses the same variable name for both types (e.g., <<x>> and <x>), the element variable pass is silently skipped because if var not in mapping (line 181) sees the name already registered from the set variable pass.

Example:

Input:    "SELECT <<x>>, <x> FROM t"
Expected: "SELECT SV001, EV001 FROM t"
Actual:   "SELECT SV001, <x> FROM t"   ← <x> left as literal text

The leftover <x> is invalid SQL and will cause the downstream mo_sql_parsing step to fail or produce a wrong AST.

Suggested fix: Key the mapping by (var_name, var_type) tuples instead of bare name, or use separate mapping dicts per variable type and merge them afterward. The downstream _substitute_placeholders (which consumes the reverse mapping) would also need updating accordingly.

Copy link
Copy Markdown
Contributor

@baiqiushi baiqiushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AI reviews can be resolved by assuming the variable name cannot be overlapped even between different variable types.

@colinthebomb1 colinthebomb1 merged commit a6a198b into main Apr 9, 2026
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants